Discrete wavelets are applied to parametrization of the intra-chain two-point correlation functions 
of homopolymers in dilute solutions obtained from Monte Carlo simulation. Several orthogonal and 
biorthogonal basis sets have been investigated for use in the truncated wavelet approximation. 
The quality of the approximation has been assessed by calculating the scaling exponents obtained 
from the des Cloizeaux ansatz for the correlation functions of homopolymers with different connectivities 
in a good solvent. The resulting exponents are in better agreement with those from recent 
renormalisation group calculations than the exponents obtained without wavelet denoising. We 
also discuss how the wavelet treatment improves the quality of data for correlation functions from 
simulations of homopolymers at varied solvent conditions and of heteropolymers. 

PACS numbers: 61.25.Hq, 02.60.-x, 36.20.Ey 



I. INTRODUCTION 

The main purpose of this paper is to give a useful introduction and a practical guide to those who would like to apply 
discrete wavelets for treating the data for the intra-chain two-point correlation functions (TPCF) $g^{(2)}_{ij}(r)$, which either 
have been previously computed from direct computer simulations, came from some theoretical technique after solving 
equations for TPCFs, or perhaps have been obtained from X-ray and neutron scattering experiments. The intra- 
chain correlation functions represent a fundamental link between the equilibrium thermodynamic observables and the 
conformational structure of polymers. These functions for polymers exhibit rather different behavior depending on 
the solvent quality. The TPCF of a homopolymer in a good solvent follows a universal scaling law [33 for which 
analytical expressions can be derived by the field theoretical and other approaches [1], [2], [3]. On the contrary, the 
TPCF in a poor solvent exhibits a complicated oscillating radial dependence akin to that of simple liquids. In this 
case, there is no known simply parametrized representation of the TPCF for the homopolymer globule. Moreover, an 
accurate sampling around a rather tall peak corresponding to the first solvation shell becomes very significant as this 
peak contributes most to the thermodynamic observables such as the mean energy. On the other hand, the TPCF in a 
good solvent obtained from molecular mechanics simulations tends to be rather noisy due to the high entropy of the 
coil conformation. This results in a large scatter of the TPCF values at small radial separations, which makes further 
fitting of the data by an analytical expression and extraction of the scaling exponents difficult. Therefore, in general, 
dealing with TPCF data of heteropolymers, for which some monomers are in a good solvent while others are in a 
poor solvent, and, particularly, extracting meaningful information from such data is a rather nontrivial problem. 

Relying on the recent works of some of us 0, 0, Ej , we believe that the task of parameterizing $g^{(2)}_{ij}(r)$ in a compact 
way can be accomplished by means of the multiresolution analysis 0,0-. At present, a number of special basis sets, 
referred to as wavelets @, are known and are being actively used for treating both smooth and sharply oscillating 
functions, as well as for denoising of signals. Wavelets have become a necessary mathematical tool in many modern 
theoretical investigations in Physics, Chemistry and other fields [H [H Q E IH [H E E HJ S3-. Wavelets 
are particularly useful in those cases when the result of the analysis of a function should contain not only the list 
of its typical frequencies (scales), but also the list of the local coordinates where these frequencies are important. 
Thus, the main field of applications of wavelets is to analyse and process different classes of functions which are either 
nonstationary (in time) or inhomogeneous (in space). 

The most general principle of the wavelet construction is to use dilations and translations. Commonly used wavelets 
form a complete (bi)orthonormal system of functions with a finite support constructed in such a way. That is why 
by changing a scale (dilations) wavelets can distinguish the local characteristics of a function at various scales, and 
by translations they cover the whole region in which a function is being studied. Due to the completeness of the base 
system, wavelets also allow one to perform the inverse transformation to decomposition, which is called reconstruction. 

In the analysis of functions with a complicated behavior, the locality property of wavelets makes the wavelet 
transform technique substantially advantageous compared to the Fourier transform. The latter provides one only 
with the knowledge of global frequencies (scales) of a function under investigation since the system of the base 
functions used (sine, cosine or imaginary exponential functions) is defined on the infinite range. The special features 
of wavelets such as their (bi)orthogonality and vanishing moments result in the need for only a few approximating 
coefficients in practical applications. That is one reason why wavelets are actively used, for example, to construct 
distribution functions in calculations of the electronic structure [Hi. I20I l2l| as well as in Statistical Mechanics 0,0,0- 

Recently, some of us have carried out several studies devoted to the wavelet parametrization of the radial density 
functions for various atomic and molecular solutes 0, 0, |(|. A model study of the density of galaxies in Ref. [22] 
uses a similar wavelet approach for a different problem. In the present work we would like to address the question 
whether wavelets can also be advantageous for approximating the intra-chain correlation functions of homopolymers 
in different solvents. The main practical goal of this paper is to apply discrete wavelets for approximating the functions 
$g^{(2)}_{ij}(r)$ of open, ring and star homopolymers in a coil conformation, as well as of a globule. In the case of a coil, the 
des Cloizeaux scaling formula applies and a number of accurate theoretical results for the scaling exponents involved 
are available 0, |2^| . Thus, we shall be able to investigate the influence of the choice of the wavelet basis set and of 
the number of terms not only on the quality of the correlation function parameterization, but also on the values of 
the scaling exponents extracted from fitting the wavelet denoised functions by the des Cloizeaux formula. 



II. METHODS 



A. Model 

To obtain the correlation functions we relied on the standard coarse-grained homopolymer model pil . I2H l2l| based 
on the following Hamiltonian in terms of the monomer coordinates, $\mathbf{X}_i$: 

The first term here represents the connectivity structure of the polymer with harmonic springs of a given strength 
$\kappa_{ij}$ introduced between any pair of connected monomers (denoted by $i \sim j$). The second term represents pair-wise 
non-bonded interactions between monomers such as the van der Waals forces, for which we adopt the Lennard-Jones 
form of the potential, 

$$V(r) = +\infty, \qquad r < d, \tag{2}$$

where there is also a hard core part with the monomer diameter $d$ (below we choose $d = 1$ without any loss of 
generality). 
We use the Monte Carlo technique with the standard Metropolis algorithm j^], which converges to the Gibbs 
equilibrium ensemble, based upon the implementation described by us in [25]. The value $V_0 = 0$ corresponds to the 
purely repulsive case (good solvent) leading to a coil conformation of the polymer, while $V_0$ = bksT corresponds 
to the attractive case (poor solvent) leading to a globular conformation as in Ref. [23]. All details of our Monte Carlo 
procedure have been previously described in Ref. [28] and, in fact, here we shall rely on the same set of Monte 
Carlo simulation data in order to make the comparison of the wavelet treated scaling exponents with those of Ref. 
[28] more straightforward and unambiguous. 
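The model and the sampling rule described above can be illustrated with a minimal sketch. It is not the authors' implementation: the spring prefactor, the Lennard-Jones tail `v0 * ((d/r)**12 - (d/r)**6)`, the single-monomer trial move and all function and parameter names are assumptions made for illustration, with energies measured in units of $k_B T$.

```python
import math
import random

def pair_potential(r, d=1.0, v0=0.0):
    """Hard core for r < d plus an assumed Lennard-Jones tail of strength v0."""
    if r < d:
        return math.inf
    x6 = (d / r) ** 6
    return v0 * (x6 * x6 - x6)

def energy(coords, bonds, kappa=1.0, d=1.0, v0=0.0):
    """Energy in units of k_B T: harmonic springs on bonds plus all non-bonded pairs."""
    e = 0.0
    for i, j in bonds:
        e += 0.5 * kappa * math.dist(coords[i], coords[j]) ** 2
    n = len(coords)
    for i in range(n):
        for j in range(i + 1, n):
            e += pair_potential(math.dist(coords[i], coords[j]), d, v0)
    return e

def metropolis_step(coords, bonds, rng, step=0.3, **params):
    """Displace one random monomer and accept with probability min(1, exp(-dE))."""
    i = rng.randrange(len(coords))
    trial = list(coords)
    trial[i] = tuple(c + rng.uniform(-step, step) for c in coords[i])
    d_e = energy(trial, bonds, **params) - energy(coords, bonds, **params)
    if d_e <= 0.0 or rng.random() < math.exp(-d_e):
        return trial, True
    return coords, False
```

A trial move that creates a hard-core overlap has infinite energy and is therefore always rejected, so a valid starting conformation stays free of overlaps.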



B. Correlation functions 



The intra-chain two-point correlation function (TPCF) of a pair of monomers $i$ and $j$ is defined as 

$$g^{(2)}_{ij}(\mathbf{r}) = \langle \delta(\mathbf{X}_i - \mathbf{X}_j - \mathbf{r}) \rangle = \frac{1}{4\pi r^2} \langle \delta(|\mathbf{X}_i - \mathbf{X}_j| - r) \rangle. \tag{3}$$

The second equality establishes that it is a function of the radius $r = |\mathbf{r}|$ only, due to spatial isotropy (SO(3) rotational 
symmetry). We may note that this function should, strictly speaking, be named a distribution function, but since 
$g^{(2)}_{ij}(r) \to 0$ when $r \to \infty$ because of the chain connectivity, we apply the term 'correlation function' to $g^{(2)}_{ij}(r)$ itself 
rather than to the quantity $g^{(2)}_{ij}(r)/(g^{(1)})^2 - 1$, which would vanish as $r \to \infty$ in the case of simple liquids. The 
function is normalized to unity via $\int d^3r\, g^{(2)}_{ij}(r) = 1$. Note that the correlation functions exactly satisfy the excluded 
volume condition, $g^{(2)}_{ij}(r) = 0$ for $r < d$, due to the choice of the hard-core part in the non-bonded potential, Eq. (2). 
The mean-squared distance between monomers $i$ and $j$ is 



$$D_{ij} \equiv \langle (\mathbf{X}_i - \mathbf{X}_j)^2 \rangle = \int d^3r\, |\mathbf{r}|^2\, g^{(2)}_{ij}(r), \tag{4}$$

which we defined here without the traditional factor of 1/3 as compared to some of the previous papers [2Jj. 

The intra-chain pair correlation functions $g^{(2)}_{ij}(r)$ are strongly dependent on both the degree of polymerization $K$ 
of the polymer and the choice of the reference monomers $i$ and $j$, the contacts between which we are looking at. However, 
as we have demonstrated in Ref. [28], if we introduce the rescaled correlation function in terms of the dimensionless 
variables, 

$$\hat{g}^{(2)}_{ij}(\hat{r}) = D_{ij}^{3/2}\, g^{(2)}_{ij}(r), \qquad \hat{r} = r / D_{ij}^{1/2}, \tag{5}$$

these will change in about the same range and hence permit a much more straightforward comparison with 
each other. From this definition, $\hat{g}^{(2)}_{ij}(\hat{r})$ obviously satisfies the following two normalization conditions: 

$$\int_0^\infty d\hat{r}\, \hat{r}^2\, \hat{g}^{(2)}_{ij}(\hat{r}) = \int_0^\infty d\hat{r}\, \hat{r}^4\, \hat{g}^{(2)}_{ij}(\hat{r}) = \frac{1}{4\pi}. \tag{6}$$
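As an illustration of Eqs. (3)-(6), the following sketch estimates the TPCF from a list of sampled inter-monomer distances and then rescales it. The histogram estimator, the bin layout and the names `tpcf_histogram` and `rescale` are assumptions introduced here; the paper does not specify how its histograms were accumulated.

```python
import math

def tpcf_histogram(distances, r_max, n_bins):
    """Estimate g(r) at bin centres, normalised so that sum(4*pi*r^2*g*dr) = 1."""
    dr = r_max / n_bins
    counts = [0] * n_bins
    for r in distances:
        k = int(r / dr)
        if 0 <= k < n_bins:
            counts[k] += 1
    total = sum(counts)
    centres = [(k + 0.5) * dr for k in range(n_bins)]
    g = [c / (total * 4.0 * math.pi * rc * rc * dr) for c, rc in zip(counts, centres)]
    return centres, g, dr

def rescale(centres, g, dr):
    """Apply the rescaling of Eq. (5), with D_ij taken from the histogram itself."""
    d_ij = sum(4.0 * math.pi * r ** 4 * gv * dr for r, gv in zip(centres, g))
    s = math.sqrt(d_ij)
    r_hat = [r / s for r in centres]
    g_hat = [gv * s ** 3 for gv in g]
    return r_hat, g_hat, dr / s, d_ij
```

Because $D_{ij}$ is computed from the same histogram, both discrete sums corresponding to Eq. (6) equal $1/(4\pi)$ up to rounding, which gives a quick consistency check on any data set.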



C. Scaling Relations 

According to Refs. 0, Q the TPCF of a flexible homopolymer coil in a good solvent can be well described [2^| via a 
power law times a stretched exponential, known as the des Cloizeaux scaling equation, 

$$\hat{g}^{(2)}_{ij}(\hat{r}) = A_{ij}\, \hat{r}^{\theta_{ij}} \exp\left(-B_{ij}\, \hat{r}^{\delta_{ij}}\right). \tag{7}$$

Due to the two normalization conditions in Eq. (6), the constants $A_{ij}$ and $B_{ij}$ can be immediately calculated and expressed 
via $\theta_{ij}$ and $\delta_{ij}$. The exponents $\delta_{ij}$ do not really depend on $i$ and $j$, but the contact exponents $\theta_{ij}$ do. In the case of the 
end-end correlations of an open chain $\theta_{ij}$ is denoted as $\theta_0$, and these can be expressed via 

$$\delta = \frac{1}{1 - \nu}, \qquad \theta_0 = \frac{\gamma - 1}{\nu}, \tag{8}$$

where $\nu$ has the meaning of the inverse fractal dimension of the system and $\gamma$ is related to the number of different 
polymer conformations 0, 0|-.
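The statement that the two normalization conditions fix the constants can be made explicit. The sketch below assumes the functional form $A\,\hat{r}^{\theta}\exp(-B\,\hat{r}^{\delta})$; with $a = (3+\theta)/\delta$ and $b = (5+\theta)/\delta$, the two radial moments give $B = [\Gamma(b)/\Gamma(a)]^{\delta/2}$ and $A = \delta B^{a} / (4\pi\,\Gamma(a))$. If the exponential is instead written as $\exp(-(B\hat{r})^{\delta})$, the formulas change accordingly. Function names are illustrative.

```python
import math

def exponents(nu, gamma):
    """Eq. (8): return (delta, theta_0) for given nu and gamma."""
    return 1.0 / (1.0 - nu), (gamma - 1.0) / nu

def des_cloizeaux_constants(theta, delta):
    """Constants A and B fixed by the two normalisation conditions of Eq. (6)."""
    a = (3.0 + theta) / delta
    b = (5.0 + theta) / delta
    big_b = (math.gamma(b) / math.gamma(a)) ** (delta / 2.0)
    big_a = delta * big_b ** a / (4.0 * math.pi * math.gamma(a))
    return big_a, big_b

def des_cloizeaux(r_hat, theta, delta):
    """Rescaled TPCF in the assumed form A * r^theta * exp(-B * r^delta)."""
    big_a, big_b = des_cloizeaux_constants(theta, delta)
    return big_a * r_hat ** theta * math.exp(-big_b * r_hat ** delta)
```

For the ideal (Gaussian) chain, $\theta = 0$ and $\delta = 2$, this reduces to $B = 3/2$ and $A = (3/2\pi)^{3/2}$, which is a convenient sanity check.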
D. Wavelet Theory 



The fundamental theory behind wavelets is known as the Multi-Resolution Analysis (MRA). Most of the rigorous 
results and definitions from MRA are not usually required for practical applications. The only equations which 
are needed for the work described herein will be introduced in this section. As we mainly use basis sets from the 
biorthogonal wavelet families, we shall introduce all wavelets in a general way as biorthogonal wavelets. Moreover, 
we shall use the Discrete Wavelet Transform (DWT) technique 0, El to parameterize the TPCFs. There is a good 
introduction to the wavelet techniques in Ref. 0|. We will also follow the style of that book henceforth. The 
multiresolution approach is based on the idea that the wavelet functions generate a hierarchical sequence of subspaces 
in the space of square-integrable functions over the real axis $L^2(\mathbb{R})$, which forms the MRA. 

The scaling functions $\varphi(r)$ and $\tilde{\varphi}(r)$ produce a biorthogonal MRA if they satisfy the following conditions. 

(i) Translates of these functions by integers, $\varphi_s = \varphi(r - s)$, $\tilde{\varphi}_s = \tilde{\varphi}(r - s)$, $s \in \mathbb{Z}$, are linearly independent and 
produce bases of the subspace $V_0 \subset L^2(\mathbb{R})$ and of its dual counterpart $\tilde{V}_0 \subset L^2(\mathbb{R})$ correspondingly. This means that 
if a function $f(r)$ is contained in the space $V_j$, its integer translates have to be contained in the same space, 

$$f(r) \in V_j \Leftrightarrow f(r + s) \in V_j, \qquad f(r) \in \tilde{V}_j \Leftrightarrow f(r + s) \in \tilde{V}_j, \qquad s \in \mathbb{Z}.$$

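(ii) Dyadic dilates of these functions, $\varphi_{js} = \varphi(2^j r - s)$, $\tilde{\varphi}_{js} = \tilde{\varphi}(2^j r - s)$, $j \in \mathbb{Z}$, generate hierarchical sets of 
subspaces $\{V_j\}$ and $\{\tilde{V}_j\}$, so that: 

$$V_j \subset V_{j+1}, \qquad \bigcup_{j=-\infty}^{\infty} V_j \ \text{is dense in } L^2(\mathbb{R}), \qquad \bigcap_{j=-\infty}^{\infty} V_j = \{0\}, \tag{9}$$

$$\tilde{V}_j \subset \tilde{V}_{j+1}, \qquad \bigcup_{j=-\infty}^{\infty} \tilde{V}_j \ \text{is dense in } L^2(\mathbb{R}), \qquad \bigcap_{j=-\infty}^{\infty} \tilde{V}_j = \{0\}.$$

It means that if a function $f(r)$ is contained in the space $V_j$, the compressed function $f(2r)$ has to be contained in 
the higher resolution space $V_{j+1}$: 

$$f(r) \in V_j \Leftrightarrow f(2r) \in V_{j+1}, \qquad f(r) \in \tilde{V}_j \Leftrightarrow f(2r) \in \tilde{V}_{j+1}, \qquad j \in \mathbb{Z}.$$

(iii) The sets of functions $\varphi_{js}(r)$ and $\tilde{\varphi}_{js}(r)$ are biorthogonal to each other. It means that for any $s, s' \in \mathbb{Z}$: 

$$\int \varphi_{js}(r)\, \tilde{\varphi}_{js'}(r)\, dr = \delta_{ss'}.$$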

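(iv) There is a wavelet function $\psi(r)$ and its dual wavelet function $\tilde{\psi}(r)$ such that their integer translates, $\psi_s(r) = 
\psi(r - s)$, $\tilde{\psi}_s(r) = \tilde{\psi}(r - s)$, and dyadic dilates, $\psi_{js} = \psi(2^j r - s)$, $\tilde{\psi}_{js} = \tilde{\psi}(2^j r - s)$, form subspaces $W_j$ and $\tilde{W}_j$ which 
are complementary to $V_j$ and $\tilde{V}_j$ so that: 

$$V_{j+1} = V_j \oplus W_j, \qquad \tilde{V}_{j+1} = \tilde{V}_j \oplus \tilde{W}_j, \qquad W_j \perp \tilde{V}_j, \qquad V_j \perp \tilde{W}_j. \tag{10}$$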

(v) From the above relations it follows that $L^2(\mathbb{R})$ can be decomposed into the approximation space $V_{j_0}$ and the 
sum of the detail spaces $W_j$ of higher resolutions $j \ge j_0$: 

$$L^2(\mathbb{R}) = V_{j_0} \oplus \bigoplus_{j=j_0}^{\infty} W_j, \tag{11}$$

where $j_0 \in \mathbb{Z}$ is a chosen level of resolution. This means that any square-integrable function $f(r)$ can be represented 
as a sum of linear combinations of the reconstruction scaling functions $\{\varphi_{j_0 s}\}$ at a chosen resolution $j = j_0$ and the 
reconstruction wavelet functions $\{\psi_{js}\}$ at all finer resolutions $j \ge j_0$. This can be written as 

$$f(r) = \sum_s a_{j_0 s}\, \varphi_{j_0 s}(r) + \sum_{j=j_0}^{\infty} \sum_s d_{js}\, \psi_{js}(r), \tag{12}$$

where the coefficients $\{a_{j_0 s}\}$ and $\{d_{js}\}$ are obtained as the scalar products with the appropriate dual decomposition 
basis functions, 

$$a_{js} = \int f(r)\, \tilde{\varphi}_{js}(r)\, dr, \qquad d_{js} = \int f(r)\, \tilde{\psi}_{js}(r)\, dr. \tag{13}$$

The latter equation defines the Discrete Wavelet Transform (DWT). 
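To make the structure of Eqs. (12) and (13) concrete for sampled data, here is a minimal sketch using the orthonormal Haar wavelet, the simplest case in which the decomposition and reconstruction functions coincide. The Haar basis is chosen only because its filters are trivial; it is not one of the basis sets recommended in this paper, and the function names are illustrative.

```python
import math

SQRT2 = math.sqrt(2.0)

def haar_step(signal):
    """One decomposition level: approximation (a) and detail (d) coefficients."""
    half = len(signal) // 2
    a = [(signal[2 * k] + signal[2 * k + 1]) / SQRT2 for k in range(half)]
    d = [(signal[2 * k] - signal[2 * k + 1]) / SQRT2 for k in range(half)]
    return a, d

def haar_inverse_step(a, d):
    """Reconstruct the finer level from its approximation and detail parts."""
    out = []
    for ak, dk in zip(a, d):
        out.append((ak + dk) / SQRT2)
        out.append((ak - dk) / SQRT2)
    return out

def haar_dwt(signal, levels):
    """Return (coarsest approximation, list of details from coarsest to finest)."""
    a, details = list(signal), []
    for _ in range(levels):
        a, d = haar_step(a)
        details.insert(0, d)
    return a, details

def haar_idwt(a, details):
    """Inverse of haar_dwt: add the details back, coarsest level first."""
    for d in details:
        a = haar_inverse_step(a, d)
    return a
```

The signal length must be divisible by `2**levels`. The coarse list `a` plays the role of the first sum in Eq. (12), and each entry of `details` corresponds to one resolution level of the second sum.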

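As $\varphi(r) \in V_0$ and $V_0 \subset V_1$, and likewise $\tilde{\varphi}(r) \in \tilde{V}_0$ and $\tilde{V}_0 \subset \tilde{V}_1$, we can express $\varphi(r)$ (as well as $\tilde{\varphi}(r)$) as a linear combination 
of the basis functions in $V_1$ ($\tilde{V}_1$): 

$$\varphi(r) = \sum_s h_s\, \varphi(2r - s), \qquad \tilde{\varphi}(r) = \sum_s \tilde{h}_s\, \tilde{\varphi}(2r - s). \tag{14}$$

This equation is called the dilation equation. Similarly, $\psi(r)$ and $\tilde{\psi}(r)$ must satisfy a wavelet dilation equation: 

$$\psi(r) = \sum_s w_s\, \varphi(2r - s), \qquad \tilde{\psi}(r) = \sum_s \tilde{w}_s\, \tilde{\varphi}(2r - s). \tag{15}$$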

The above sets of coefficients are usually called "filters", and they are completely sufficient to describe a chosen 
wavelet basis because there are several procedures for building up numerical values of the wavelet functions from the 
set of filters 0, 0, E|. We should emphasize here that there are no analytic expressions for biorthogonal (orthogonal) 
wavelets with a finite support [4CJ. These are determined in terms of their filter coefficients only. But one can obtain 
the values of these functions with any given accuracy by using special procedures, which are well described in the 
wavelet literature 0, E|. 
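One such procedure can be sketched for the four-coefficient Daubechies scaling function, used here purely as an example of a filter without a closed-form function. The values at the integers follow from the dilation equation (14) as an eigenvector problem, and values at finer dyadic points are then generated by applying Eq. (14) repeatedly. The filter normalisation (coefficients summing to 2, matching Eq. (14) as written) and the function name are assumptions of this sketch.

```python
import math

S3 = math.sqrt(3.0)
# Daubechies filter with two vanishing moments, normalised so that sum(H) = 2
H = [(1 + S3) / 4, (3 + S3) / 4, (3 - S3) / 4, (1 - S3) / 4]

def scaling_function(levels):
    """Tabulate phi on the dyadic grid r = m / 2**levels for 0 <= r <= 3."""
    # Exact values at the integers (they solve the dilation equation there)
    phi = {0.0: 0.0, 1.0: (1 + S3) / 2, 2.0: (1 - S3) / 2, 3.0: 0.0}
    for j in range(1, levels + 1):
        step = 1.0 / 2 ** j
        for m in range(1, 3 * 2 ** j, 2):          # new points: odd multiples of step
            r = m * step
            # phi(r) = sum_s h_s * phi(2r - s); points outside [0, 3] contribute zero
            phi[r] = sum(h * phi.get(2 * r - s, 0.0) for s, h in enumerate(H))
    return dict(sorted(phi.items()))
```

Dyadic grid points are exactly representable as floats, so the dictionary lookups are exact. Each extra level doubles the resolution of the table, which is how the function values can be obtained to any required accuracy from the filter alone.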

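The scaling functions and the wavelets have a finite support only in the case of a finite number of the coefficients 
$h_s$ and $w_s$. Due to their biorthogonal nature these functions satisfy the relations: 

$$\int \varphi_{ja}(r)\, \tilde{\varphi}_{jb}(r)\, dr = \delta_{ab},$$

$$\int \varphi_{ja}(r)\, \tilde{\psi}_{lb}(r)\, dr = 0, \quad (l \ge j), \tag{16}$$

$$\int \psi_{ja}(r)\, \tilde{\varphi}_{lb}(r)\, dr = 0, \quad (j \ge l),$$

$$\int \psi_{ja}(r)\, \tilde{\psi}_{kb}(r)\, dr = \delta_{jk}\, \delta_{ab},$$

for any integer $j$, $k$, $l$, $a$, $b$. 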

If the pairs of the decomposition functions $\{\tilde{\varphi}, \tilde{\psi}\}$ and the reconstruction functions $\{\varphi, \psi\}$ are identical, the transform 
is called an 'orthogonal wavelet transform'. Otherwise we shall talk about a more general 'biorthogonal wavelet 
transform'. 

In the expansion (12) the first term gives a 'coarse' approximation for $f(r)$ at the resolution $j_0$ and the second term 
gives a sequence of successive 'details'. In practice, we do not actually need to use an infinite number of resolutions. 
Therefore, the sequence of details is cut off at an appropriate resolution $j_{max}$. Since all functions used in numerical 
work are given in a finite interval, the sequence of different translates $\{s\}$ also has a finite number of terms $S$. It 
should be mentioned that $S$ can be different for the detailed and coarse approximations. 

Importantly, the explicit form of the basis functions is not required if we are using (bi)orthogonal wavelets with a 
finite support and a dyadic set of scales $j$. Then the coefficients in Eq. (13) can be calculated by the Fast Wavelet 
Transform (FWT) algorithm 0, El-. The main idea of this algorithm is that a set of (bi)orthogonal discrete 
filters at consecutively dilated scales is used for the multi-resolution analysis of a signal. As a result, to calculate the 
approximating coefficients, only the convolution of the signal with the relevant filter is required at each scale, and the 
latter can be easily obtained. 

By choosing relevant basis functions and scales we can nullify most of the coefficients $\{a\}$ and $\{d\}$ while keeping 
the square root error (SRE) small, since the DWT satisfies Parseval's identity |{|-. Therefore, the function under study can 
be reconstructed with the use of only a few nonzero coefficients without any significant loss of accuracy, making the 
total number of the approximating coefficients rather small. This feature of the wavelet approximation is widely 
used in processing of signals and images, the data for which should be compressed with minimal losses jl()j. 
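The link between discarded coefficients and the reconstruction error can be checked numerically. The sketch below again uses the orthonormal Haar transform as a stand-in for the bases discussed in this paper; for an orthonormal transform Parseval's identity implies that the root-square error of the truncated reconstruction equals the norm of the discarded coefficients. All names are illustrative.

```python
import math

SQRT2 = math.sqrt(2.0)

def haar_forward(x):
    """Full orthonormal Haar transform; len(x) must be a power of two."""
    a, coeffs = list(x), []
    while len(a) > 1:
        half = len(a) // 2
        s = [(a[2 * k] + a[2 * k + 1]) / SQRT2 for k in range(half)]
        d = [(a[2 * k] - a[2 * k + 1]) / SQRT2 for k in range(half)]
        coeffs = d + coeffs            # coarser details go in front
        a = s
    return a + coeffs                  # [approximation, details coarse -> fine]

def haar_inverse(c):
    """Inverse of haar_forward."""
    a, pos = c[:1], 1
    while pos < len(c):
        d = c[pos:pos + len(a)]
        pos += len(a)
        nxt = []
        for ak, dk in zip(a, d):
            nxt += [(ak + dk) / SQRT2, (ak - dk) / SQRT2]
        a = nxt
    return a

def keep_largest(c, n_keep):
    """Zero all but the n_keep coefficients of largest absolute value."""
    order = sorted(range(len(c)), key=lambda k: abs(c[k]), reverse=True)
    kept = set(order[:n_keep])
    return [v if k in kept else 0.0 for k, v in enumerate(c)]
```

For biorthogonal bases the identity holds only approximately, so the equality verified by this sketch should be read as the orthogonal limiting case.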

E. Choice of wavelet basis set. 

The compression and denoising properties of the wavelet transform strongly depend on the fundamental properties 
of the wavelet bases, which we define here in a rather simplified way as: the number of vanishing moments, regularity, 
size of support, symmetry and orthogonality/biorthogonality. 
Number of vanishing moments: A wavelet function $\psi(r)$ has $N_{v.m.}$ vanishing moments if: 

$$\int r^{\mu}\, \psi(r)\, dr = 0 \quad \text{for } \mu = 0, \ldots, N_{v.m.} - 1. \tag{17}$$

The number of vanishing moments strongly influences the localization of wavelets in the frequency space. The 
Fourier transform of a wavelet with $N_{v.m.} = n$ has a peak and decays as $k^{n}$ for small $k$ ($k$ means frequency). 

Regularity: This can be defined as the number $p$ of existing derivatives of a wavelet function. It also characterizes 
the frequency localization of wavelets. The Fourier transform of a wavelet with regularity $p = n$ decays as 
$k^{-(n+1)}$ for large $k$. We would like to emphasize that, as wavelets have no analytic expressions, the definition of 
their derivatives is not as straightforward as for "usual" functions |j. However, these mathematical details are 
beyond the scope of this article. 

Size of support: This is the length of the interval on which the wavelet function has non-zero values. Obviously, 
this characterizes the space localization of the wavelet. 

Symmetry: The wavelet basis functions can be strongly symmetric or asymmetric. The deviation of a wavelet from 
the symmetry (i.e. even or odd parity) is usually measured by how the phase of its Fourier transform deviates 
from a linear function. It was shown that it is impossible to construct an orthogonal basis with the exact parity 
of the functions. 

On the contrary, we can design a biorthogonal basis set with the exact symmetry of the functions without serious 
effort H H|. 

Orthogonality/Biorthogonality: As we have already mentioned, in the case when the pairs of the decomposition 
functions $\{\tilde{\varphi}, \tilde{\psi}\}$ and the reconstruction functions $\{\varphi, \psi\}$ are identical, the wavelet transform is orthogonal. 
Otherwise it is biorthogonal. But this is true only if $\{\varphi, \psi\}$ and $\{\tilde{\varphi}, \tilde{\psi}\}$ obey the conditions (16). We 
should mention that there are several non-orthogonal families of wavelets such as the Mexican Hat, Morlet and Gaussian 
wavelets and so on 0, 0, . Usually they have infinite support and do not exactly obey Parseval's identity. 
Therefore such wavelets do not provide a one-to-one reconstruction of a function from its wavelet expansion 
coefficients. Due to these circumstances we do not use such basis sets in our work. 
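The vanishing-moment condition of Eq. (17) has a discrete counterpart that can be checked directly on the filter coefficients: the wavelet has $N_{v.m.}$ vanishing moments when the sums $\sum_k k^{\mu} w_k$ vanish for $\mu < N_{v.m.}$. The sketch below does this for the four-coefficient Daubechies filter, chosen only as a familiar example; the orthonormal normalisation and the alternating-flip construction of the wavelet filter are standard conventions assumed here, and the names are illustrative.

```python
import math

S3 = math.sqrt(3.0)
# Orthonormal four-coefficient Daubechies scaling filter (sum of squares = 1)
H = [c / (4.0 * math.sqrt(2.0)) for c in (1 + S3, 3 + S3, 3 - S3, 1 - S3)]
# Wavelet filter obtained by the alternating flip of the scaling filter
G = [(-1) ** k * H[len(H) - 1 - k] for k in range(len(H))]

def discrete_moment(filt, order):
    """Discrete moment sum_k k**order * filt[k] of a filter."""
    return sum(k ** order * c for k, c in enumerate(filt))

# The first two moments vanish, the third does not: two vanishing moments
moments = [discrete_moment(G, mu) for mu in range(3)]
```

The same check applied to a longer filter gives a quick way to confirm how many vanishing moments a tabulated basis actually has.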

Summing up the above, we can conclude that in order to provide good denoising of a signal the wavelets have 
to possess good regularity and as many vanishing moments as possible. On the other hand, they have to be 
well localized in space, which means that they must have a fairly short support. Unfortunately, these properties 
are interrelated. Thus, a small support implies only a few vanishing moments and poor regularity. In addition, 
orthogonality implies asymmetry of the basis functions, which in turn can lead to some numerical artefacts. Since 
for each concrete task certain wavelet properties are more important than others, there are different wavelet families 
which are optimized for some of these properties. 

For example, in the case of Daubechies wavelets we have a maximum number of vanishing moments and maximal 
asymmetry with fixed length of support, while the Symlet wavelet family has the "least asymmetry" and highest 
number of vanishing moments with a given support width. 

It was shown that it is possible to construct wavelet basis sets with the scaling function having vanishing moments 
of non-zero order with respect to some shifting constant $c$. Thus, for a given number of vanishing moments $N_{v.m.}$ we 
have: 

$$\int (r - c)^{n}\, \varphi(r)\, dr = 0, \qquad 0 < n < N_{v.m.}.$$

The Coifman wavelets are compactly supported wavelets which have the highest number of vanishing moments for 
both $\varphi(r)$ and $\psi(r)$ with a given width of support. This property is very useful for the treatment of functions 
with sharp peaks and slopes. The larger the number of the scaling function vanishing moments, the better is the 
approximation for singular points of the function under study |g}. Hence, by using such wavelets (e.g. Coifman) 
we can accurately treat sharp peaks of such a function. On the other hand, these wavelets are smooth enough to 
approximate the function well within the ranges between these peaks. The price for this extra feature is that the 
Coifman wavelets are longer than the Daubechies wavelets. Their length of support is equal to $3N_{v.m.} - 1$ instead of 
$2N_{v.m.} - 1$. 

Thus we can see that for orthogonal wavelets the desirable properties are in contradiction to each other. But 
fortunately, we can use different functions for the decomposition and reconstruction. These biorthogonal bases have 
several advantages compared with the orthogonal bases. We can benefit from the fact that we can use basis 
functions $\tilde{\varphi}, \tilde{\psi}$ with a large number of vanishing moments for decomposition, and functions $\varphi, \psi$ with a good regularity 
for reconstruction. The former would separate any unpleasant stochastic oscillations of TPCFs, leaving this 'noise' 
to the detail coefficients at higher levels of resolution. The latter, on the other hand, would produce a TPCF 
approximation as smooth as possible during reconstruction. If, however, we would prefer to impose both conditions 
of a large number of vanishing moments and regularity on an orthogonal basis, we would have to pay with a support 
at least twice that of the biorthogonal basis. Large supports, on the other hand, are known to lead to a 
significant deterioration in the quality of the wavelet approximation 

Him 

In this work we will use biorthogonal bases from two biorthogonal families: the Biorthogonal Spline Wavelets, whose 
decomposition functions $\tilde{\psi}(r)$ are optimized for the number of vanishing moments, while the reconstruction functions 
$\psi(r)$ are optimized in the sense of regularity; and the Reverse Biorthogonal Spline Wavelets, whose decomposition functions 
$\tilde{\psi}(r)$ are optimized to achieve maximal regularity with a given support width, while the reconstruction functions $\psi(r)$ 
are constructed in order to gain a maximum number of vanishing moments. In addition, these biorthogonal 
sets have the exact symmetry for all the basis functions. 



F. Wavelet Algorithm 

A typical way of building the wavelet approximation is as follows ^(j. The coefficients obtained by FWT are sorted 
in order of decreasing absolute value and then only some number $L$ of the largest coefficients are kept by 
nullifying the rest of the coefficients. This is followed by application of the inverse transform (reconstruction). Note 
that the truncation number $L$ depends on the required accuracy of representation of the function in question. However, 
this scheme is difficult to apply because of an undesired intersection between different levels of resolution which often 
arises. The latter leads to a much increased number of coefficients required without any sensible improvement in 
the accuracy. The quality of the resulting approximation is not particularly high because the numerical boundary 
artifacts result in the so-called Gibbs effect, i.e. false oscillations of the approximated function [9]. 

Therefore, we will use instead a 'smarter' strategy in which we exploit the following four circumstances: 

1. For physical reasons the functions $\hat{g}^{(2)}_{ij}(\hat{r})$ vanish at $\hat{r} \to 0$ (due to the excluded volume effect) and at $\hat{r} \to \infty$ 
(due to the finite size of the molecule). 

2. For rescaled radii $\hat{r} > 0.75$ the functions $\hat{g}^{(2)}_{ij}(\hat{r})$ have a rapid exponential (or even a faster stretched 
exponential) decay. 

3. On physical grounds it is also well known that $\hat{g}^{(2)}_{ij}(\hat{r})$ is differentiable to a high order for large $\hat{r}$. 

4. The multi-resolution nature of the wavelet analysis allows us to treat each level of the wavelet decomposition 
separately. 

We have developed an advanced scheme of the wavelet approximation which, first of all, takes into account the 
peculiarities of the TPCF. On the other hand, it relies on the strategy of 'level-by-level' thresholding, which has been 
independently proposed by several authors 0, |n|. 

By taking into account the asymptotic behavior of TPCFs, we can use the zero boundary conditions while doing 
the wavelet decomposition. Considering the values of $\hat{g}(\hat{r} \to 0)$ as zero, we can also nullify all wavelet coefficients 
corresponding to the range $[0, 0.05]$. Strictly speaking, the upper bound for this cut-off is given by $\hat{r}_l = d/\sqrt{D_{ij}}$ and 
it depends on the system size and parameters, but the value of 0.05 is well below this bound for all the data considered 
in this paper. As we have decomposition functions with a sufficient number of vanishing moments, we can nullify all 
detail coefficients at all levels of resolution which correspond to the range of rescaled radius $\hat{r} \in [0.75, \infty)$ in order 
to extract the trends of our TPCF with a 'maximal smoothness' 0, l30j |. The value of this lower bound $\hat{r}_r$ has the 
meaning of the rescaled radius after which the TPCF has a fairly smooth decaying behavior. For other regions of $\hat{r}$ we 
extract the highest detail coefficients in each level of resolution separately. 

Summing up all of the above, we propose the following scheme for the TPCF wavelet approximation: 

1. We perform FWT with zero boundary conditions up to the largest scale $M$ satisfying the condition $\sum_b |d_{M,b}| < 
\epsilon \sum_b |a_{M,b}|$ (where a good choice for $\epsilon$ is 0.05); all further $d$-coefficients can then be neglected. 

2. All the coefficients corresponding to the range $\hat{r} \in [0, 0.05]$ (for both the approximation and the detail) are also 
nullified. 

3. We save all the approximation coefficients which remain non-vanishing in the previous steps. 

4. All the detail coefficients corresponding to $\hat{r} \in [0.75, \infty)$ are nullified. 

5. In each level of decomposition we leave the maximal detail coefficients corresponding to the function extrema, 
while neglecting the rest of the coefficients. 

6. We perform the conventional inverse FWT but only for the non-zero coefficients remaining from the previous 
steps. 

7. To suppress the Gibbs effect at the left boundary, the approximated TPCF is set equal to zero up to $\hat{r}_{cross}$, 
where $\hat{r}_{cross}$ is the rightmost nontrivial zero point of the approximated TPCF, i.e., $\hat{g}_{app}(\hat{r}_{cross}) = 0$. 
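The seven steps above can be sketched in a few lines. This is a simplified illustration and not the authors' code: it uses the orthonormal Haar wavelet instead of the RB5-5 basis, interprets step 1 as dropping any detail level that is negligible against the approximation at the same level, keeps a fixed number `n_keep` of details per level in step 5, and takes $\hat{r}_{cross}$ in step 7 as the last non-positive point to the left of the main peak. All of these choices, and all names, are assumptions.

```python
import math

SQRT2 = math.sqrt(2.0)

def haar_step(x):
    """One Haar decomposition level: approximation and detail coefficients."""
    half = len(x) // 2
    a = [(x[2 * k] + x[2 * k + 1]) / SQRT2 for k in range(half)]
    d = [(x[2 * k] - x[2 * k + 1]) / SQRT2 for k in range(half)]
    return a, d

def haar_inverse_step(a, d):
    """Undo haar_step."""
    out = []
    for ak, dk in zip(a, d):
        out += [(ak + dk) / SQRT2, (ak - dk) / SQRT2]
    return out

def denoise_tpcf(g, dr, levels=3, eps=0.05, r_left=0.05, r_right=0.75, n_keep=2):
    """Level-by-level thresholding of a rescaled TPCF sampled at r_i = i * dr."""
    a, kept_details = list(g), []
    for j in range(1, levels + 1):
        a, d = haar_step(a)
        width = 2 ** j * dr                      # radial extent of one coefficient
        centre = [(k + 0.5) * width for k in range(len(d))]
        # steps 2 and 4: details outside (r_left, r_right) are nullified
        d = [v if r_left < c < r_right else 0.0 for v, c in zip(d, centre)]
        # step 1 (one possible reading): a negligible detail level is dropped
        if sum(abs(v) for v in d) < eps * sum(abs(v) for v in a):
            d = [0.0] * len(d)
        # step 5: only the n_keep largest details of this level survive
        top = sorted(range(len(d)), key=lambda k: abs(d[k]), reverse=True)[:n_keep]
        kept_details.append([v if k in top else 0.0 for k, v in enumerate(d)])
    # steps 2 and 3: approximation coefficients are kept, except near the origin
    width = 2 ** levels * dr
    a = [0.0 if (k + 0.5) * width <= r_left else v for k, v in enumerate(a)]
    # step 6: inverse transform, from the coarsest level to the finest
    for d in reversed(kept_details):
        a = haar_inverse_step(a, d)
    # step 7: zero the left tail up to the last non-positive point before the peak
    peak = max(range(len(a)), key=lambda i: a[i])
    cross = max((i for i in range(peak) if a[i] <= 0.0), default=-1)
    return [0.0 if i <= cross else v for i, v in enumerate(a)]
```

The input length must be divisible by `2**levels`. With Haar wavelets the result is piecewise constant wherever the details are removed, so a smoother basis would be needed to reproduce the quality of approximation reported in this paper.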

As a result, we have a fast scheme of calculations and a compact approximation for the correlation functions. 

Concerning the choice of the wavelet basis set, we note that there are many sets suitable for realizing the FWT, such as 
Daubechies, Coifman, Symlets, biorthogonal wavelets, and so on [!J,|2!j. We have tested various basis sets, and our 
detailed study presented below indicates that the reverse biorthogonal basis (RB5-5) is the best of them for the treatment 
of the TPCF for the systems under study. Here we shall follow Daubechies' notation for this family: the first index, 
$N_d = 5$, refers to the decomposition functions, and the second index, $N_r = 5$, to the reconstruction ones. These indices 
reflect the number of vanishing moments of $\psi$, namely $N_{v.m.} = N_r - 1$, the regularity value of $\tilde{\psi}$, namely $p = N_r - 1$, 
as well as the length of support $l$ for the pairs $\{\tilde{\varphi}, \tilde{\psi}\}$: $l_d = 2 N_d + 1$, and for the pairs $\{\varphi, \psi\}$: $l_r = 2 N_r + 1$. Fig. 1 
depicts the functions from the RB5-5 basis set. 

III. RESULTS 

To illustrate the usefulness of our scheme we have investigated the two-point correlation functions of ring, linear and 
star homopolymers in the coil state, as well as of the globular state of a ring homopolymer, since the connectivity is not 
as important for the latter state. The data for $g^{(2)}_{ij}(r)$ have been obtained by the Monte Carlo simulations discussed in our 
previous study [28]. Fig. 2 depicts the typical behavior of $\hat{g}^{(2)}_{ij}(\hat{r})$ for an open homopolymer coil and a ring homopolymer 
globule. As one can see, in the liquid globular state the TPCF has several peaks of increasing width and decreasing 
height located at approximately $n d/\sqrt{D_{ij}}$ ($n = 1, 2, \ldots$). On the other hand, the TPCF of a coil exhibits a smoother 
radial dependence, but suffers from significant statistical noise. 

The correlation functions obtained from such data are then approximated by the wavelet procedure described above. 
Fig. 3 shows the difference $\Delta g(\hat{r}) = \hat{g}(\hat{r}) - \hat{g}_{app}(\hat{r})$ between the TPCFs obtained by simulations and their approximations 
by wavelets (solid curves) and cosines (dashed curves) with the same number of terms $L$. One can clearly see that the 
wavelet treatment provides a much better approximation than the cosine Fourier treatment. For the coil ($L = 20$), 
at small radial separations both treatments do show deviations from the simulation data, but these only reflect the 
limitations of the sampling statistics of the TPCF, as the function should really be very smooth and obey the des Cloizeaux 
equation. However, while the wavelet treatment gives an essentially vanishing $\Delta g$ for larger $\hat{r}$, the Fourier treatment 
continues to yield parasitic oscillations at all separations. For the globule ($L = 25$), which had a much better quality 
of data due to the smaller entropy of the globule, the wavelet treatment gives an essentially vanishing $\Delta g$ everywhere, 
whereas the Fourier method works very poorly in the whole range, with strong oscillations present even for the largest 
of separations. 

In Fig. 4 we present four different levels of the wavelet decomposition of the TPCF of an open coil. We can see that the 
smooth part of this function can be well represented by the approximation coefficients. Conversely, the unpleasant 
oscillations are concentrated in the detail coefficients. 

In Fig. 5 we likewise present four different levels of the wavelet decomposition of the TPCF for the globule of a ring 
homopolymer. We can see that the smooth part of this function can be mainly represented by the approximation 
coefficients. But there is also important information in the detail coefficients, which mainly represent the sharp 
peaks of the function. Therefore, our 'smart' level-by-level technique allows us to effectively suppress noise in the case 
of the coil and to avoid 'oversmoothing' of physical oscillations in the case of the globule. 

We have also calculated the mean square norm of the inaccuracy $\Delta$, which characterizes the quality of the
approximation: $\Delta = \sqrt{\sum_i (g(\hat r_i) - g_{\rm app}(\hat r_i))^2}$, where $\hat r_i = i \cdot \delta\hat r$
are the grid points, $g(\hat r_i)$ is the 'true' correlation function from the Monte Carlo data, and
$g_{\rm app}(\hat r_i)$ is the approximated one. Figs. 6 and 7 depict the dependence of the norm $\Delta$ on the number L
of approximating coefficients for the coil and globule states, respectively. In the above we have used the 'RB5-5'
basis set. Here, for comparison, we also depict these dependences for the cosine and FFT approximations, which are
widely used in applications [?]. We can see that for a reasonable number of coefficients the wavelet approximation
gives a remarkably better accuracy than the conventional methods.
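
As an illustration of how such an error curve can be produced for the cosine approximation, the following sketch
evaluates the error of a cosine series truncated to its first L terms. It assumes that the norm is the square root of
the summed squared deviations over the grid, and it uses a plain DCT-II expansion written out by hand; the function
names are hypothetical and the actual cosine scheme used for Figs. 6 and 7 may differ in detail.

```python
import math


def sqrt_error(g, g_app):
    """Square root of the summed squared deviations between g and g_app."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(g, g_app)))


def cosine_approx(g, n_terms):
    """Approximate g by the first n_terms (>= 1) terms of its DCT-II series."""
    n = len(g)
    # Forward transform: projections onto the cosine basis functions.
    coeffs = [
        sum(g[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        for k in range(n_terms)
    ]
    # Inverse transform restricted to the retained coefficients.
    approx = []
    for i in range(n):
        value = coeffs[0] / n
        value += (2.0 / n) * sum(
            coeffs[k] * math.cos(math.pi * (i + 0.5) * k / n)
            for k in range(1, n_terms)
        )
        approx.append(value)
    return approx


# Example: error as a function of the number of retained coefficients
# for a smooth test profile (not simulation data).
profile = [math.exp(-((i - 10) / 3.0) ** 2) for i in range(32)]
errors = {L: sqrt_error(profile, cosine_approx(profile, L)) for L in (2, 8, 32)}
```

Because the cosine basis is orthogonal, the error cannot increase as more terms are retained, and it vanishes up to
rounding when all terms are kept.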

For the approximated TPCF we have also evaluated the scaling exponents for the coil state. In this case we can
compare these results with the rather accurate theoretical values obtained from the Borel resummed renormalisation
group calculations [?]. As in Ref. [28], the fitting has been done via the nonlinear least-squares (NLLS)
Marquardt-Levenberg method [34] by means of the fit function in the gnuplot software. The fit function reports
parameter error estimates, which are obtained from the variance-covariance matrix after the final iteration. By
convention, these estimates are called 'standard errors', and they are reported in Tab. I, which contains the results
for open and ring homopolymers. Here we have used the wavelet approximation with L = 20 coefficients. In this table we
also include the results obtained from fitting the untreated functions as in Ref. [28]. The notations in the first
column follow the des Cloizeaux convention: 0 for end-end monomers, 1 for end-middle, 1' for end-three quarters,
1" for end-one quarter, and 2 for one quarter-three quarters of the chain, respectively. Here and below, the reported
errors are those from the fitting procedure only and do not necessarily account for statistical and other simulation
errors.

We can see that the exponents obtained from the wavelet approximated functions agree much better with the most recent
theoretical values. Note also that some of the theoretical values in this table have been updated thanks to the more
accurate values from Refs. [?], as compared to those which we have used in Ref. [28]. Moreover, we do not even need to
freeze $\delta$ at the theoretical value in order to extract a more accurate estimate for $\theta$, as we had to do
previously. These improvements in the results of our fitting are not surprising given that, as we have mentioned
earlier, the coefficient cut-off leads to an effective noise suppression.
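
To make the role of freezing $\delta$ concrete, the following sketch shows why it simplifies the problem. It assumes
a four-parameter des Cloizeaux-type form $g(\hat r) = A \hat r^{\theta} \exp(-(D \hat r)^{\delta})$, which is our
reading of the ansatz and is not reproduced from Eq. (7) here. With $\delta$ fixed, the logarithm of this form is
linear in $\ln A$, $\theta$ and $B = D^{\delta}$, so that an ordinary linear least-squares fit suffices. Note that
fitting in logarithmic space weights the data points differently from the NLLS fit used in this work, and all function
names are hypothetical.

```python
import math


def solve_linear(matrix, rhs):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(rhs)
    aug = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - s) / aug[r][r]
    return x


def fit_frozen_delta(r, g, delta):
    """Fit ln g = ln A + theta * ln r - B * r**delta with delta held fixed.

    Returns (A, theta, D), where D = B**(1/delta); this requires B > 0
    and strictly positive r and g values.
    """
    basis = [[1.0, math.log(ri), -(ri ** delta)] for ri in r]
    target = [math.log(gi) for gi in g]
    # Normal equations of the linear least-squares problem.
    normal = [[sum(row[a] * row[b] for row in basis) for b in range(3)]
              for a in range(3)]
    rhs = [sum(row[a] * t for row, t in zip(basis, target)) for a in range(3)]
    ln_a, theta, b = solve_linear(normal, rhs)
    return math.exp(ln_a), theta, b ** (1.0 / delta)
```

When $\delta$ is released as a fourth free parameter, the problem becomes genuinely nonlinear, which is where the
quality of the input data becomes decisive.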

The least-squares fitting of the data $\{x_i, y_i\}$ with a model function $y(x_i | \mathbf{a})$, which depends on the
fitting parameters $\mathbf{a}$ in a nonlinear fashion, is in the multi-variate case a complex problem akin to that of
finding the global minimum of the merit function $\chi^2(\mathbf{a}) = \sum_i \sigma_i^{-2} (y_i - y(x_i | \mathbf{a}))^2$
with respect to N parameters $\mathbf{a}$, where $\sigma_i$ is the standard deviation (error) of the i-th data point.
If the data are fairly noisy, the problem of finding the global minimum of $\chi^2$ becomes complicated, as there are
many low-lying local minima of this function and its constant value surfaces have a complicated topology. The
Marquardt-Levenberg method is one of the most popular fitting algorithms; it is an efficient hybrid of the
inverse-Hessian (variable metric) and the steepest descent (conjugate gradients) minimization algorithms for $\chi^2$
[34]. In practice, the iterations need to be stopped once the values of $\chi^2$ change by less than the specified
precision and, clearly, the resulting fitted values $\mathbf{a}_{\rm fit}$ may depend on the choice of the initial
values $\mathbf{a}_0$ if there are many local minima present, as well as on the weights $\sigma_i$. It is not uncommon
to find the parameters wandering around near the global minimum in a flat valley of complicated topology if the input
data were fairly noisy [34]. The wavelet treatment turns the initially poorly-defined fitting problem into a
well-defined one (which becomes essentially independent of the choice of the initial parameters) by removing the
high-frequency statistical noise from the data, thus simplifying the topology of the constant $\chi^2$ surfaces and
getting rid of its many artificial local minima. At the same time, the variances (squared standard errors) of the
fitted parameters and the covariances between them become smaller than for the untreated data, as we are now
guaranteed to have found the true global minimum of $\chi^2$.
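
The interpolation between the two limiting algorithms can be illustrated by a minimal sketch of the damped iteration.
It is not the gnuplot implementation: it uses a hypothetical two-parameter model $y = a \exp(-b x)$ in place of the
des Cloizeaux form, and a simple accept-or-reject rule for the damping parameter lam. For small lam the step is close
to the inverse-Hessian step, while for large lam it turns into a short step along the rescaled negative gradient.

```python
import math


def model(x, a, b):
    return a * math.exp(-b * x)


def chi2(xs, ys, sigmas, a, b):
    """Merit function: sum of squared, error-weighted residuals."""
    return sum(((y - model(x, a, b)) / s) ** 2 for x, y, s in zip(xs, ys, sigmas))


def levenberg_marquardt(xs, ys, sigmas, a, b, lam=1e-3, tol=1e-12, max_iter=200):
    """Minimise chi2 over (a, b); returns (a, b, chi2). Start with a != 0."""
    current = chi2(xs, ys, sigmas, a, b)
    for _ in range(max_iter):
        # Accumulate the approximate Hessian J^T W J and the vector J^T W r.
        h_aa = h_ab = h_bb = g_a = g_b = 0.0
        for x, y, s in zip(xs, ys, sigmas):
            e = math.exp(-b * x)
            da, db = e, -a * x * e          # derivatives of the model
            res = y - a * e
            w = 1.0 / (s * s)
            h_aa += w * da * da
            h_ab += w * da * db
            h_bb += w * db * db
            g_a += w * da * res
            g_b += w * db * res
        # Marquardt damping: inflate the diagonal by the factor (1 + lam).
        m_aa, m_bb = h_aa * (1.0 + lam), h_bb * (1.0 + lam)
        det = m_aa * m_bb - h_ab * h_ab
        step_a = (m_bb * g_a - h_ab * g_b) / det
        step_b = (m_aa * g_b - h_ab * g_a) / det
        trial = chi2(xs, ys, sigmas, a + step_a, b + step_b)
        if trial < current:
            # Accept the step and move towards the inverse-Hessian limit.
            a, b = a + step_a, b + step_b
            converged = current - trial < tol
            current = trial
            lam *= 0.1
            if converged:
                break
        else:
            # Reject the step and move towards the steepest descent limit.
            lam *= 10.0
            if lam > 1e12:
                break
    return a, b, current
```

On noise-free data the iteration converges rapidly from a reasonable starting point; with noisy data the outcome may
depend on the initial values, which is the situation discussed above.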

Tab. II lists similar exponents for star homopolymers with the number of arms f = 12 in a good solvent. Here we
have used the wavelet approximation with L = 30 coefficients. We have then compared the results with those from
the analytical renormalisation group calculations in the so-called 'cone' approximation for $\theta$ from Ref. [?].
To the best of our knowledge, this comparison has not been made previously in Ref. [28] or elsewhere. The agreement
between the Monte Carlo and theoretical values seems quite reasonable despite the relatively short length of the arms
and the limitations of the 'cone' approximation. The latter produces contact exponents that depend only on the
functionalities $f_i$, $f_j$ of the two monomers in question and not on any other parameters of the star, namely



36^2- 
As one can see, the quality of the wavelet approximation is rather good for the combined scheme, while the number
of approximating coefficients is quite small.

In general, the accuracy of the wavelet approximations with a fixed number of reconstruction coefficients depends
on the chosen basis set. So far, no exact 'recipe' has been given for which basis to use in a particular case. Thus,
we have checked our choice of one of the dual bases, 'RB5-5', by an additional study. As we are especially
interested in the quality of the scaling exponent calculations, we have evaluated the scaling exponents for an open
homopolymer coil with the use of different bases. These results are presented in Tab. III. We have used the wavelet
approximations of the end-end TPCFs with the same number of coefficients. For this comparison we chose typical
representatives of the main wavelet families: Coifman 2, Daubechies 4, Symlet 4, Biorthogonal 5-5 and Discrete Meyer
wavelets [?].

We can see that the 'Reverse Biorthogonal 5-5' basis set yields a better approximation than the other bases. On the
other hand, the other bases, apart from the Discrete Meyer one, also give good fitting results compared to the
untreated TPCF. This means that our current scheme is just one of several possible successful choices of the basis
set. The poor performance of the Discrete Meyer basis is easily explained by its excessively large support length,
l = 60 as compared with l = 11 for 'RB5-5', which is known to lead to strong over-smoothing.
IV. CONCLUSION

Our present study indicates that discrete wavelets are a suitable and powerful instrument for approximating
the intra-chain two-point correlation functions (TPCF) of different homopolymers in dilute solutions. The wavelet
technique allows us to extract the scaling properties from fairly noisy data more accurately and reliably than can be
done by direct fitting. The wavelet treatment removes the high-frequency stochastic fluctuations (part of the
'statistical noise') in the data, thereby producing a somewhat 'coarse-grained' approximation of the data. This turns
the ill-defined multi-variate nonlinear fitting procedure for the untreated data into a well-defined, uniquely
convergent fitting procedure after the wavelet treatment. Naturally, this also reduces the standard deviations of the
fitted parameters. At the same time, the wavelet treatment does not over-smooth the data: it retains the genuine
oscillations, as we clearly see in the case of the globule, and it does not produce any of the unpleasant artefacts of
the truncated Fourier approximation.

We can see that the dual basis set performs particularly well for approximating TPCFs. This is related to the
properties of the basis, namely, that the decomposition functions have a maximal number of vanishing moments for a
finite support, whereas the reconstruction functions are as regular as possible for a given length of support.
Moreover, the proposed scheme is rather flexible, as it is based on the conventional FWT algorithm. One can easily
choose the basis set and adjust the number of coefficients for a particular problem. From the results in Tab. III we
can also conclude that with almost any reasonable basis it is possible to obtain an improvement in the fitting
procedure.

It should be emphasized that the wavelet scheme is rather universal. The scheme of the wavelet approximation
proposed here allows us to represent the correlation functions with a small number of approximating coefficients not
only for the coil but also for the globular state of the homopolymers. For instance, our procedure yields a relative
accuracy of the approximation of order $\Delta \sim 0.5 \cdot 10^{-3}$. Such accurate knowledge of TPCFs can be used
as an input to the self-consistent calculations of the inter-chain distribution functions in the framework of the
density functional methods [?, ?, ?] and others.

Due to the compact parameterization and the high accuracy of the approximation by wavelets, we hope that
wavelets can be applied not only for approximating the inter-chain distribution functions of polymers, but also
for calculating these functions by the integral equations theory of polymers [38]. The success of the recent
applications of wavelets to the theory of molecular solutes has indicated that the method is capable of calculating
the thermodynamic characteristics of solvation rather accurately.

We believe that further progress in this direction can also be of importance for the novel theories for calculating the
intra-chain TPCFs of polymers directly from a force field. Some of us are presently working on the Super-Gaussian
Self-Consistent (SGSC) theory for a single macromolecule with any two-body Hamiltonian, in which a set of
integro-differential equations is derived for $g_{ij}(r)$ as well as for $D_{ij}$. In order to reduce the computational
expense of such calculations, having a compact and accurate multi-resolution representation for $g_{ij}(\hat r)$ is
essential.

Finally, due to the very general nature of the wavelet theory, we hope that wavelets can find numerous other
applications for describing spatial and temporal dependences of various observables in a number of fields of soft
condensed matter theory which they have not hitherto beneficially influenced.
TABLE I: Comparison of the exponents $\delta$ and $\theta$ between the results from the direct fitting of the Monte
Carlo data, those from their wavelet approximations (subscript ww) and the theoretical results (subscript theor) for
open and ring homopolymer coils with the degree of polymerization K = 200. Fitted values have been obtained by a
four-parametric fit via Eq. (7).

        $\delta$       $\delta_{ww}$   $\delta_{theor}$   $\theta$        $\theta_{ww}$    $\theta_{theor}$
0       2.11 ± 0.07    2.36 ± 0.02     2.428 ± 0.001      0.36 ± 0.02     0.276 ± 0.005    0.271 ± 0.002
1       2.23 ± 0.04    2.36 ± 0.02                        0.56 ± 0.01     0.51 ± 0.01      ~ 0.46
1'      2.42 ± 0.04    2.39 ± 0.02                        0.45 ± 0.01     0.462 ± 0.003    0.459 ± 0.003
1"      2.04 ± 0.08    2.39 ± 0.02                        0.68 ± 0.03     0.52 ± 0.02      ~ 0.46
2       2.39 ± 0.07    2.40 ± 0.02                        0.81 ± 0.02     0.80 ± 0.01      0.80 ± 0.01
Ring    2.46 ± 0.07    2.40 ± 0.02                        0.79 ± 0.006    0.815 ± 0.005    0.80 ± 0.01



TABLE II: Values of the exponents $\delta$ and $\theta$ for star homopolymers with f = 12 arms and (N − 1)/f = 50 arm
length in a good solvent. The following notations for the monomer pairs have been adopted: a : n, b : m, where a, b
number arms and n, m number monomers within arms, and 0 refers to the core monomer. Other notations are the same as in
Tab. I.

                    $\delta$       $\delta_{ww}$   $\theta$        $\theta_{ww}$    $\theta_{theor}$
0, a:m=25           2.67 ± 0.03    2.67 ± 0.002    3.036 ± 0.04    3.024 ± 0.02     2.677
0, a:m=50           2.56 ± 0.03    2.62 ± 0.01     0.59 ± 0.02     1.51 ± 0.01      1.442
a:n=25, a:m=50      2.40 ± 0.06    2.48 ± 0.004    0.55 ± 0.01     0.50 ± 0.003     0.458
a:n=50, b:m=50      2.30 ± 0.03    2.18 ± 0.01     0.23 ± 0.01     0.26 ± 0.007     0.277



TABLE III: Comparison of the exponents $\delta$ and $\theta$ between the theoretical results, different wavelet
approximations (the names of the wavelet bases are given in the first column), and the untreated results from the
direct fitting of Monte Carlo data for the end-end TPCFs of an open homopolymer coil with the degree of polymerization
K = 200.

                $\delta$         $\theta$
Theoretical     2.428 ± 0.001    0.271 ± 0.002
'RB5-5'         2.36 ± 0.02      0.276 ± 0.005
'BR5-5'         2.36 ± 0.02      0.31 ± 0.01
Daubechies 4    2.35 ± 0.02      0.29 ± 0.02
Symlet 4        2.34 ± 0.02      0.28 ± 0.02
Coifman 2       2.36 ± 0.02      0.285 ± 0.005
Discr. Meyer    2.1 ± 0.01       0.4 ± 0.01
Untreated       2.11 ± 0.07      0.36 ± 0.02
FIG. 1: Reverse Biorthogonal Spline Wavelets 5-5. The abscissa is the real numbers axis ($x \in \mathbb{R}$). At the
top are the decomposition scaling function and wavelet function, and at the bottom are the corresponding
reconstruction functions. Here and in all other figures the axes are depicted in dimensionless units.




FIG. 2: Rescaled correlation function of the homopolymers with the degree of polymerization K = 200. The solid curve
corresponds to the end-end correlations of an open homopolymer in the coil state. The dashed curve corresponds to the
globule of a ring homopolymer with K = 200 and for |i − j| = 100.
FIG. 3: The difference $\Delta g(\hat r)$ between the TPCF obtained from simulations and their approximations by
wavelets (solid curve) and cosines (dashed curve). The upper part of the figure corresponds to the coil of an open
homopolymer, while the lower one
to the globule of a ring homopolymer of the same lengths as in Fig. [5] 
FIG. 4: Four different levels of the wavelet decomposition of the end-end TPCF of an open homopolymer with K = 200.
At the top there are the approximating coefficients {a} at the level $j_0 = 0$. The detail coefficients {d} are
presented in ascending order of the resolution level j vs the shift parameter s.
FIG. 5: Four different levels of the wavelet decomposition of the TPCF of the homopolymer globule of the same lengths
and for the same chain indices as in Fig. [?]. At the top there are the approximating coefficients {a} at the level
$j_0 = 0$. The detail coefficients {d} are presented in ascending order of the resolution level j vs the shift
parameter s.
FIG. 6: Square root error $\Delta$ between the rescaled end-end TPCF of a homopolymer ring in the coil state with the
degree of polymerization K = 200 and its approximations. The curves correspond to the wavelet approximation (solid
line), FFT approximation (dashed line) and cosine approximation (dash-dotted line). The X-axis is the total number L of
the approximation coefficients used.
FIG. 7: Square root error $\Delta$ between the rescaled TPCF of a homopolymer ring with K = 200 and |i − j| = 100 in
the globular state and its approximations. The curves correspond to the wavelet approximation (solid line), FFT
approximation (dashed line) and cosine approximation (dash-dotted line). The X-axis is the number L of approximation
coefficients used.



